Accurate parameter estimation for Bayesian Network Classifiers using Hierarchical Dirichlet Processes

Authors

François Petitjean

Wray L. Buntine

Geoffrey I. Webb

Nayyar A. Zaidi

Abstract

This paper introduces a novel parameter estimation method for the probability tables of Bayesian network classifiers (BNCs), using hierarchical Dirichlet processes (HDPs). The main result of this paper is to show that improved parameter estimation allows BNCs to outperform leading learning methods such as Random Forest for both 0-1 loss and RMSE, albeit just on categorical datasets. As data assets become larger, entering the hyped world of "big", efficient and accurate classification requires three main elements: (1) low-bias classifiers that can capture the fine detail of large datasets; (2) out-of-core learners that can learn from data without having to hold it all in main memory; and (3) models that can classify new data very efficiently. The latest Bayesian network classifiers satisfy these requirements. Their bias can be controlled easily by increasing the number of parents of the nodes in the graph. Their structure can be learned out of core with a limited number of passes over the data. However, as the bias is made lower to accurately model classification tasks, so is the accuracy of their parameter estimates, because each parameter is estimated from ever-decreasing quantities of data. In this paper, we introduce the use of hierarchical Dirichlet processes for accurate BNC parameter estimation. We conduct an extensive set of experiments on 68 standard datasets and demonstrate that our resulting classifiers perform very competitively with Random Forest in terms of prediction, while keeping the out-of-core capability and superior classification time.

François Petitjean, Wray Buntine, Geoffrey I. Webb, Nayyar Zaidi: Faculty of Information Technology, Monash University

arXiv:1708.07581v2 [cs.LG] 19 Dec 2017
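The abstract's point that each parameter is estimated from ever smaller amounts of data as parents are added can be illustrated with a small sketch. The code below is not the paper's HDP estimator; it is a much simpler recursive back-off m-estimate, shown only to convey the idea of sharing statistical strength between a conditional probability and its less specific "parent" contexts. The function name `backoff_estimate`, the smoothing weight `m`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def backoff_estimate(rows, context, target, values, m=1.0):
    """Return {value: P(target = value | context)} via recursive m-estimates.

    rows    : list of tuples of categorical values (toy data, hypothetical)
    context : tuple of (column_index, value) pairs, most general first;
              None stands for the uniform root of the hierarchy
    target  : column index of the variable being predicted
    values  : all possible values of the target variable
    m       : smoothing weight (an illustrative choice, not from the paper)
    """
    if context is None:
        # Root of the hierarchy: uniform distribution over the target values.
        return {v: 1.0 / len(values) for v in values}

    # Prior for this context: the estimate with the last condition dropped.
    parent_context = context[:-1] if context else None
    prior = backoff_estimate(rows, parent_context, target, values, m)

    # Count target values among the rows that match every condition.
    matching = [r for r in rows if all(r[i] == v for i, v in context)]
    counts = Counter(r[target] for r in matching)
    total = len(matching)

    # Shrink the empirical counts toward the parent estimate.
    return {v: (counts[v] + m * prior[v]) / (total + m) for v in values}


# Toy data: columns are (class, parent attribute, child attribute).
rows = [
    ("pos", "a", "x1"),
    ("pos", "a", "x1"),
    ("pos", "b", "x2"),
    ("neg", "a", "x2"),
]

# Only one row matches class = "pos" and parent = "b", so a raw frequency
# estimate would assign probability zero to "x1" in this context.
estimate = backoff_estimate(rows, ((0, "pos"), (1, "b")), 2, ["x1", "x2"])
print(estimate)
```

With a single matching row, the raw frequency estimate would be degenerate, whereas the back-off estimate keeps some mass on the unseen value by borrowing from the coarser context. The HDP approach described in the abstract addresses the same sparsity problem with a principled Bayesian hierarchy rather than a fixed weight `m`.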


Similar resources

Learning Bayesian Network Structure using Markov Blanket in K2 Algorithm

A Bayesian network is a graphical model that represents a set of random variables and their causal relationships via a Directed Acyclic Graph (DAG). There are basically two methods used for learning Bayesian networks: parameter learning and structure learning. One of the most effective structure-learning methods is the K2 algorithm. Because the performance of the K2 algorithm depends on node...


Bayesian Estimation of Shift Point in Shape Parameter of Inverse Gaussian Distribution Under Different Loss Functions

In this paper, a Bayesian approach is proposed for shift point detection in an inverse Gaussian distribution. In this study, the mean parameter of the inverse Gaussian distribution is assumed to be constant and shift points in the shape parameter are considered. First the posterior distribution of the shape parameter is obtained. Then the Bayes estimators are derived under a class of priors and using variou...


Parameter Estimation for the Latent Dirichlet Allocation

We review three algorithms for parameter estimation of the Latent Dirichlet Allocation model: batch variational Bayesian inference, online variational Bayesian inference and inference using collapsed Gibbs sampling. We experimentally compare their time complexity and performance. We find that the online variational Bayesian inference converges faster than the other two inference techniques, wit...


Multi-Task Classification for Incomplete Data

A non-parametric hierarchical Bayesian framework is developed for designing a sophisticated classifier based on a mixture of simple (linear) classifiers. Each simple classifier is termed a local “expert”, and the number of experts and their construction are manifested via a Dirichlet process formulation. The simple form of the “experts” allows direct handling of incomplete data. The model is fu...


Classical and Bayesian Inference in Two Parameter Exponential Distribution with Randomly Censored Data

This paper deals with classical and Bayesian estimation for the two-parameter exponential distribution having scale and location parameters with randomly censored data. The censoring time is also assumed to follow a two-parameter exponential distribution with a different scale but the same location parameter. The main stress is on the location parameter in this paper. This parameter has not...



Journal:

CoRR

Volume abs/1708.07581

Publication date: 2017
